Add LoRA handling for image generation#4084

Open
atobiszei wants to merge 36 commits into main from atobisze_image_inpainting_lora

Collaborator

@atobiszei atobiszei commented Mar 25, 2026

This includes:
-> LoRA pulling
-> multiple LoRA handling
-> NPU LoRA handling

atobiszei added 24 commits March 2, 2026 16:28
mask field should only be accepted in image edit (inpainting) requests,
not in text-to-image generation requests.
- Add --source_loras CLI parameter for specifying LoRA adapters
  Format: alias=org/repo@file.safetensors (comma-separated, @file optional)
- Add LoRA adapter entries to image_gen_calculator.proto
- Parse and validate LoRA settings in image_generation_graph_cli_parser
- Export LoRA adapter entries in graph.pbtxt generation
- Load LoRA .safetensors via ov::genai::Adapter in pipelines.cpp
- Apply LoRA adapters at inference time based on model name routing
- Download LoRA repos via curl (resolve safetensors filename from HF API)
- Add LoRA alias routing in mediapipe factory
- Pass modelName through HttpPayload for LoRA alias matching
- Add 18 unit tests (CLI parsing, graph export, proto parsing, config)
- Support multiple LoRA source types: HF repo, direct URL, local file (alias= required)
- Extract shared curl_downloader utility from gguf_downloader
- Add composite LoRA aliases (e.g. blend=@pokemon:0.7+@anime:0.5)
- Support per-request lora_weights override in extra_body
- Local files referenced by absolute path in graph.pbtxt (no copy)
- HF LoRA: resolve .safetensors via API, download with curl
- clone() delegates to pullLoraAdapters() for all LoRA downloads
- resolveHfLoraFilenames() + pullLoraAdapters() split (private -> protected)
- Remove loraQueue: T2I/I2I always use clone(), only inpainting serialized
- PipelineSlotGuard (renamed from InpaintingQueueGuard)
- compileProperties built once in constructor (no default arg on reshapeAndCompile)
- CompositeLoraMap type alias replaces duplicate runtime structs
- Multiline composite formatting in graph.pbtxt
- Add RUN_UNSTABLE-gated pull tests for LoRA (HF resolve, download, full-flow)
- Add non-network unit tests (local file skip, non-imagegen no-op)
- Add SetUpServerForDownloadWithLoras test helper
- 59 tests pass (55 original + 4 new, 3 network-gated skip without RUN_UNSTABLE)
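The `--source_loras` format from the commit notes above (`alias=org/repo@file.safetensors`, comma-separated, `@file` optional) can be illustrated with a minimal sketch. This is not the PR's parser: `LoraSourceEntry`, `parseLoraEntry`, and `parseSourceLoras` are hypothetical names, and the sketch only covers the HF repo form, not direct URLs, local files, or composite aliases.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical holder for one parsed entry (not the PR's settings type).
struct LoraSourceEntry {
    std::string alias;
    std::string repo;      // e.g. "org/repo"
    std::string filename;  // empty when the optional "@file" part is omitted
};

// Parses a single "alias=org/repo[@file.safetensors]" token.
std::optional<LoraSourceEntry> parseLoraEntry(const std::string& token) {
    const auto eq = token.find('=');
    if (eq == std::string::npos || eq == 0 || eq + 1 >= token.size()) {
        return std::nullopt;  // "alias=" is required and the source must be non-empty
    }
    LoraSourceEntry entry;
    entry.alias = token.substr(0, eq);
    const std::string source = token.substr(eq + 1);
    const auto at = source.find('@');
    entry.repo = source.substr(0, at);  // whole source when '@' is absent
    if (at != std::string::npos) {
        entry.filename = source.substr(at + 1);
    }
    if (entry.repo.empty()) {
        return std::nullopt;
    }
    return entry;
}

// Splits the comma-separated CLI value and parses each token.
std::optional<std::vector<LoraSourceEntry>> parseSourceLoras(const std::string& value) {
    std::vector<LoraSourceEntry> entries;
    std::size_t start = 0;
    while (true) {
        const auto comma = value.find(',', start);
        const auto end = (comma == std::string::npos) ? value.size() : comma;
        auto entry = parseLoraEntry(value.substr(start, end - start));
        if (!entry) {
            return std::nullopt;  // reject the whole value on the first bad token
        }
        entries.push_back(*entry);
        if (comma == std::string::npos) {
            break;
        }
        start = comma + 1;
    }
    return entries;
}
```

Rejecting the whole value on the first malformed token is a design choice of this sketch; the PR may report errors differently.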
Resolved conflicts in:
- pipelines.hpp: keep PipelineSlotGuard name and LoRA fields, adopt main's blocking comment
- pipelines.cpp: keep LoRA adapter loading, compileProperties, adopt main's SPDLOG_ERROR
- http_image_gen_calculator.cc: keep LoRA logic, adopt main's const ref for inpainting tensors
- README.md: accept main's updated examples (model names, sizes, notes)
- Fix downloadFileWithCurl: use overload instead of const ref default parameter
  (was binding temporary to const std::string&)
- Add HF_TOKEN auth header to curl downloads for HF repos only
  (avoid leaking credentials to arbitrary DIRECT_URL servers)
- Rename authToken -> authTokenHF for clarity
- Skip RUN_UNSTABLE tests when HF_TOKEN is not set
- Provide explicit safetensors filename in download tests
- Restore missing 'curl -O' PNG download commands in image_generation README
- Update copilot-instructions rule 13: expanded dangling reference guidance
- Add missing #include <vector>, <utility> (cpplint)
- Fix comment spacing (cpplint)
- clang-format all changed files
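The credential-scoping rule from the commit above (send `HF_TOKEN` only to HF repos, never to arbitrary direct URLs) could look roughly like this. A minimal sketch: `LoraSourceType` and `buildAuthHeader` are hypothetical names, and the caller is assumed to have read the token from the environment already.

```cpp
#include <optional>
#include <string>

// Hypothetical source classification; the PR's actual enum may differ.
enum class LoraSourceType { HF_REPO, DIRECT_URL, LOCAL_FILE };

// Returns the header to attach to the download request, or nothing when
// credentials must not be sent (non-HF source or no token configured).
std::optional<std::string> buildAuthHeader(LoraSourceType type, const std::string& hfToken) {
    if (type != LoraSourceType::HF_REPO || hfToken.empty()) {
        return std::nullopt;
    }
    return "Authorization: Bearer " + hfToken;
}
```

Keeping the decision in one small function makes the "HF only" rule easy to unit test without touching the network.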
MSVC /W4 treats variable shadowing as error (C4456).
Inner loop variable 'it' shadowed outer pipelinesMap iterator.
- Detect Windows absolute paths (e.g. C:\path\to\file.safetensors)
  in addition to Unix paths (/ and ./ prefixes)
- Also detect .\ prefix for relative Windows paths
- Use find_last_of("/\\") instead of rfind('/') to extract
  filename from both Unix and Windows paths
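The path handling described in this commit can be sketched as follows. This is an illustration only: `looksLikeLocalPath` and `fileNameOf` are hypothetical names rather than the PR's `isLocalFilePath`, and accepting a forward slash after the drive letter is an assumption.

```cpp
#include <cctype>
#include <string>

// Heuristic: does the source string look like a local file path?
bool looksLikeLocalPath(const std::string& s) {
    if (s.empty()) {
        return false;
    }
    if (s[0] == '/') {
        return true;  // Unix absolute path
    }
    if (s.rfind("./", 0) == 0 || s.rfind(".\\", 0) == 0) {
        return true;  // relative path with ./ or .\ prefix
    }
    // Windows absolute path: drive letter, colon, then a separator.
    return s.size() >= 3 && std::isalpha(static_cast<unsigned char>(s[0])) != 0 &&
           s[1] == ':' && (s[2] == '\\' || s[2] == '/');
}

// Extracts the filename using both separators, as find_last_of("/\\") does.
std::string fileNameOf(const std::string& path) {
    const auto pos = path.find_last_of("/\\");
    return pos == std::string::npos ? path : path.substr(pos + 1);
}
```

With `rfind('/')` alone, a path such as `C:\path\to\file.safetensors` would be returned unchanged, which is the case `find_last_of("/\\")` addresses.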
…ting_lora

# Conflicts:
#	src/mediapipe_internal/mediapipegraphdefinition.cpp
#	src/mediapipe_internal/mediapipegraphdefinition.hpp
#	src/pull_module/BUILD
#	src/server.cpp
#	src/test/graph_export_test.cpp
@dtrawins dtrawins added this to the 2026.2_rc milestone May 8, 2026
atobiszei added 5 commits May 13, 2026 10:29
…ll_hf_models

- Fix demos/image_generation/README.md: use adapter alias as model name
  instead of base model + lora_weights for LoRA selection
- Fix guidance_scale: 0 -> 0.0 (OVMS rejects integer values)
- Fix docs/image_generation/reference.md: clarify model name routing as
  the adapter selection mechanism, document blending via composite adapters
- Fix docs/model_server_rest_api_image_generation.md: clarify lora_weights
  only overrides weights of already-active adapters
- Add docs/pull_hf_models.md: section on pulling image gen models with LoRA
- Add static isValidLoraAlias() in CLI parser to sanitize LoRA alias names
  (alphanumeric, hyphens, underscores, dots only)
- Add ServableNameChecker collision detection in mediapipefactory when
  registering LoRA aliases (reject if alias shadows model/pipeline/graph name)
- Revert file_system_poll_wait_seconds default to 1 and
  sequence_cleaner_poll_wait_minutes default to 5
- Fix missing HfDownloaderPullHfModel test fixture after merge
- NPU detected: set AdapterConfig::MODE_STATIC, skip runtime adapter switching
- Reject composite LoRA adapters on NPU (runtime switching unavailable)
- Warn when multiple LoRAs configured on NPU (all compiled permanently)
- Rename npuLoraFused -> npuLoraStaticMode for accuracy
- Add CLI LoRA parsing tests: alias validation, source types, composites
- Add pbtxt composite LoRA test in text2image_test
- Add local file path tests (Unix absolute, Windows behind ifdef)
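The alias rule mentioned above (alphanumeric, hyphens, underscores, dots only) is simple enough to sketch. `looksLikeValidLoraAlias` is a hypothetical stand-in for the PR's `isValidLoraAlias()`, and rejecting the empty string is an assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Accepts only [A-Za-z0-9], '-', '_' and '.'; empty aliases are rejected (assumption).
bool looksLikeValidLoraAlias(const std::string& alias) {
    if (alias.empty()) {
        return false;
    }
    return std::all_of(alias.begin(), alias.end(), [](unsigned char c) {
        return std::isalnum(c) != 0 || c == '-' || c == '_' || c == '.';
    });
}
```

Because aliases double as model names in request routing, a strict allow-list like this keeps characters such as `/`, `=` or `@` out of routable names.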
… weight->alpha

- Add aliasesConflict() to ServableNameChecker interface for LoRA alias
  collision detection during graph validation (before factory lock)
- Implement aliasesConflictExcluding() in MediapipeFactory with shared_lock
- Validate aliases in mediapipegraphdefinition validate() after initializeNodes
- Simplify createDefinition alias loop (checks moved to validate phase)
- Update reloadDefinition to clear+re-register aliases on reload
- NPU LoRA calculator rejection: reject requests to main graph name when
  npuLoraStaticMode is active (direct client to use alias)
- Multi-LoRA NPU: require composite_lora_adapters definition (hard error)
- Multi-LoRA NPU calculator: only composite aliases accepted as targets
- Rename CompositeLoraComponent.weight -> alpha across proto/struct/CLI/export
- Rename npuLoraFused -> npuLoraStaticMode
- Register composite aliases for routing in image_gen_node_initializer
- Fix fmt formatting of resolution_t in imagegen_init.cpp log statements
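For context on the composite aliases and the `weight` -> `alpha` rename, here is a minimal sketch of parsing the composite form quoted earlier in the thread (`blend=@pokemon:0.7+@anime:0.5`). The struct and function names are hypothetical, and defaulting a missing `:alpha` to 1.0 is an assumption.

```cpp
#include <cstddef>
#include <exception>
#include <optional>
#include <string>
#include <vector>

// Hypothetical types for illustration (not the PR's CompositeLoraComponent).
struct CompositeComponent {
    std::string adapterAlias;
    float alpha = 1.0f;  // assumed default when ":alpha" is omitted
};
struct CompositeAlias {
    std::string alias;
    std::vector<CompositeComponent> components;
};

// Parses "name=@adapterA:0.7+@adapterB:0.5".
std::optional<CompositeAlias> parseComposite(const std::string& token) {
    const auto eq = token.find('=');
    if (eq == std::string::npos || eq == 0) {
        return std::nullopt;
    }
    CompositeAlias result;
    result.alias = token.substr(0, eq);
    std::size_t start = eq + 1;
    while (true) {
        const auto plus = token.find('+', start);
        const std::string part =
            token.substr(start, plus == std::string::npos ? std::string::npos : plus - start);
        if (part.size() < 2 || part[0] != '@') {
            return std::nullopt;  // every component must be "@alias[:alpha]"
        }
        CompositeComponent component;
        const auto colon = part.find(':');
        component.adapterAlias =
            part.substr(1, colon == std::string::npos ? std::string::npos : colon - 1);
        if (component.adapterAlias.empty()) {
            return std::nullopt;
        }
        if (colon != std::string::npos) {
            const std::string number = part.substr(colon + 1);
            try {
                std::size_t used = 0;
                component.alpha = std::stof(number, &used);
                if (used != number.size()) {
                    return std::nullopt;  // trailing garbage after the number
                }
            } catch (const std::exception&) {
                return std::nullopt;  // not a number, or out of range
            }
        }
        result.components.push_back(component);
        if (plus == std::string::npos) {
            break;
        }
        start = plus + 1;
    }
    return result;
}
```

Splitting on `+` means alphas written in exponent form such as `1e+2` are not supported by this sketch.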
atobiszei added 3 commits May 14, 2026 16:43
…ting_lora

# Conflicts:
#	docs/pull_hf_models.md
Add LoraLoadMode enum to proto and C++ to support different LoRA
loading strategies:
- DYNAMIC (default): Runtime-switchable adapters
- STATIC: Static rank compilation
- FUSE: Permanently merge LoRA into base weights at compile time

FUSE adapters are compiled separately and excluded from runtime
alias registration. DYNAMIC/STATIC adapters remain switchable at
generate time.

Also fixes composite LoRA alias registration (skip FUSE adapters)
and adds tests for the new functionality.
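The load-mode rule in this commit message (FUSE adapters are excluded from runtime alias registration, DYNAMIC and STATIC stay switchable) can be sketched as a filter. The enum values mirror the names in the message; `AdapterEntry` and `routableAliases` are hypothetical.

```cpp
#include <string>
#include <vector>

// Mirrors the three modes named in the commit message.
enum class LoraLoadMode { DYNAMIC, STATIC, FUSE };

// Hypothetical adapter description for illustration.
struct AdapterEntry {
    std::string alias;
    LoraLoadMode mode = LoraLoadMode::DYNAMIC;  // DYNAMIC is the stated default
};

// Returns the aliases that may be registered for request routing.
// FUSE adapters are merged into the base weights, so they are skipped.
std::vector<std::string> routableAliases(const std::vector<AdapterEntry>& adapters) {
    std::vector<std::string> aliases;
    for (const auto& adapter : adapters) {
        if (adapter.mode != LoraLoadMode::FUSE) {
            aliases.push_back(adapter.alias);
        }
    }
    return aliases;
}
```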
- Rename expectedImageGenNpuFuse → expectedImageGenNpuStatic in tests
- Fix NPU error message: 'fused' → 'static' in imagegen_init.cpp
- Fix CLI parser comment: clarify STATIC mode for NPU adapters
- All 44 LoRA tests passing
@atobiszei atobiszei marked this pull request as ready for review May 18, 2026 07:34
Copilot AI review requested due to automatic review settings May 18, 2026 07:34
Contributor

Copilot AI left a comment

Pull request overview

This PR extends OVMS image generation to support LoRA adapters end-to-end: CLI parsing (--source_loras) and graph export, LoRA download during HF pull, pipeline compilation with adapters, per-request adapter selection via model name routing (including composites), and alias-based routing/visibility in the MediaPipe factory.

Changes:

  • Add LoRA adapter definitions (single + composite) to ImageGen graph proto, parsing, and graph export/CLI plumbing.
  • Implement HF pull support for LoRA adapters (HF repo resolution + download; direct URL/local path support via CLI parsing).
  • Add runtime routing support for LoRA aliases (MediaPipe alias registration/hide-base-model behavior) and request-level lora_weights overrides.

Reviewed changes

Copilot reviewed 44 out of 44 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/test/text2image_test.cpp Adds pbtxt parsing tests for LoRA adapter fields.
src/test/test_utils.hpp Declares new server test helpers for pull/start with LoRAs and REST port.
src/test/test_utils.cpp Implements new server test helpers (threaded start with LoRA args).
src/test/pull_hf_model_test.cpp Adds HF pull + LoRA tests and a large unstable pull/serve/generate integration test.
src/test/ovmsconfig_test.cpp Adds config parsing tests for invalid/valid --source_loras combinations.
src/test/graph_export_test.cpp Adds extensive graph export + CLI-to-settings tests for LoRA/composites/source types.
src/stringutils.hpp Declares isLocalFilePath.
src/stringutils.cpp Implements isLocalFilePath (Unix + Windows absolute + ./ .\).
src/server.cpp Adjusts HF pull module casting to call non-const clone().
src/servable_name_checker.hpp Extends checker interface with alias-conflict detection.
src/pull_module/hf_pull_model_module.hpp Makes clone() non-const; exposes LoRA resolve/pull helpers (protected).
src/pull_module/hf_pull_model_module.cpp Adds HF API resolution for LoRA safetensors + downloads during clone().
src/pull_module/gguf_downloader.cpp Refactors curl download logic to shared curl downloader helper.
src/pull_module/curl_downloader.hpp New shared curl download helper API.
src/pull_module/curl_downloader.cpp New curl downloader implementation (progress + optional auth header).
src/pull_module/BUILD Adds curl_downloader target; wires into pull module deps.
src/modelmanager.hpp Implements new aliasesConflict API.
src/modelmanager.cpp Adds alias conflict checks across models/pipelines/mediapipe definitions.
src/mediapipe_internal/mediapipegraphdefinition.hpp Stores discovered LoRA aliases + hide-base-model flag.
src/mediapipe_internal/mediapipegraphdefinition.cpp Validates LoRA alias conflicts; propagates LoRA routing metadata from node init.
src/mediapipe_internal/mediapipefactory.hpp Adds alias→graph mapping and helper methods.
src/mediapipe_internal/mediapipefactory.cpp Registers LoRA aliases for lookup/listing; hides base model when requested.
src/mediapipe_internal/graph_side_packets.hpp Extends side packets with LoRA aliases and hide-base-model flag.
src/image_gen/pipelines.hpp Adds adapter/composite storage; renames queue guard to PipelineSlotGuard.
src/image_gen/pipelines.cpp Loads adapters and compiles pipelines with adapter properties; tracks NPU/static mode.
src/image_gen/imagegenutils.cpp Allows lora_weights in accepted request fields.
src/image_gen/imagegenpipelineargs.hpp Adds LoRA adapter + composite settings to pipeline args.
src/image_gen/imagegen_init.cpp Parses LoRA adapter and composite entries from ImageGenCalculatorOptions.
src/image_gen/image_gen_node_initializer.cpp Registers LoRA aliases into graph side packets; sets hide-base-model.
src/image_gen/image_gen_calculator.proto Adds LoRA adapter and composite adapter proto fields + load mode enum.
src/image_gen/http_image_gen_calculator.cc Applies LoRA selection per request (model routing) + optional lora_weights.
src/http_rest_api_handler.cpp Persists resolved model name into HttpPayload.
src/http_payload.hpp Adds modelName to payload for downstream routing logic.
src/graph_export/image_generation_graph_cli_parser.cpp Adds --source_loras parsing (repo/url/local + composites + alpha).
src/graph_export/graph_export.cpp Emits LoRA adapter entries (and composite entries) into generated graph.pbtxt.
src/cli_parser.cpp Adds --source_loras CLI option and stores it in HF settings.
src/capi_frontend/server_settings.hpp Adds LoRA settings types + HFSettingsImpl::sourceLoras.
src/BUILD Adds cpp-httplib + image_generation_graph_cli_parser deps to tests.
docs/pull_hf_models.md Documents --source_loras for pull mode.
docs/model_server_rest_api_image_generation.md Documents lora_weights request field.
docs/image_generation/reference.md Adds LoRA adapter usage docs (routing, composites, overrides).
demos/image_generation/README.md Adds Multi-LoRA serving examples; improves inpainting/outpainting notes.
demos/common/export_models/export_model.py Adds --source_loras support for exporting image generation configs (with LoRA download).
.github/copilot-instructions.md Updates guidance about avoiding dangling refs in default args.
Comments suppressed due to low confidence (1)

src/test/test_utils.cpp:850

  • This overload builds argv using port.c_str() where port is a local variable, and argv itself is a stack array. The server thread may outlive this function, so the argument pointers can dangle (use-after-scope). Please ensure argument storage outlives the thread (heap-owned vectors captured by value).
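One way to address the use-after-scope concern above is to let a single object own the argument strings and build the `char*` view only where it is consumed. This is a minimal sketch, not the test helper from the PR; `OwnedArgs` is a hypothetical name.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Owns the argument strings so that any argv-style view built from it stays
// valid for as long as this object is alive and not modified.
class OwnedArgs {
public:
    explicit OwnedArgs(std::vector<std::string> args) : storage(std::move(args)) {}

    // Builds a fresh, null-terminated argv view pointing into the owned strings.
    std::vector<char*> argv() {
        std::vector<char*> pointers;
        pointers.reserve(storage.size() + 1);
        for (auto& arg : storage) {
            pointers.push_back(&arg[0]);
        }
        pointers.push_back(nullptr);  // conventional argv terminator
        return pointers;
    }

    std::size_t size() const { return storage.size(); }

private:
    std::vector<std::string> storage;
};
```

A server thread would then capture an `OwnedArgs` by value (or by move) in its lambda and call `argv()` inside the thread body, so neither the strings nor the pointer array live on the launching function's stack.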

Comment thread src/graph_export/image_generation_graph_cli_parser.cpp
Comment thread src/image_gen/pipelines.cpp
Comment thread src/pull_module/hf_pull_model_module.cpp Outdated
Comment thread src/pull_module/curl_downloader.cpp
Comment thread src/test/test_utils.cpp
Comment on lines +58 to +63
// All adapters were registered at compile time (alpha=1.0 each).
// At generate time we must explicitly set the adapter config:
// - If modelName matches a composite alias: activate all component adapters with their weights.
// - If modelName matches a single adapter alias: activate that adapter.
// - Otherwise: disable all adapters (alpha=0) so the base model runs clean.
// lora_weights from request body can override default weights.
Collaborator

Update.

Collaborator Author

It is already updated.
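The selection rules in the quoted code comment (composite alias, single alias, otherwise base model, plus `lora_weights` overrides) can be sketched as a pure function. All names here are hypothetical, the single-adapter alpha of 1.0 is an assumption, and the real calculator applies the result through the OpenVINO GenAI adapter config rather than returning a map.

```cpp
#include <map>
#include <string>
#include <vector>

using AlphaMap = std::map<std::string, float>;         // adapter alias -> alpha
using CompositeMap = std::map<std::string, AlphaMap>;  // composite alias -> components

// Computes the alpha for every registered adapter for one request.
AlphaMap resolveAlphas(const std::string& modelName,
                       const std::vector<std::string>& adapters,
                       const CompositeMap& composites,
                       const AlphaMap& requestOverrides) {
    AlphaMap result;
    for (const auto& alias : adapters) {
        result[alias] = 0.0f;  // default: disabled, so the base model runs clean
    }
    const auto composite = composites.find(modelName);
    if (composite != composites.end()) {
        // Composite alias: activate each known component with its weight.
        for (const auto& [alias, alpha] : composite->second) {
            auto it = result.find(alias);
            if (it != result.end()) {
                it->second = alpha;
            }
        }
    } else if (result.count(modelName) > 0) {
        result[modelName] = 1.0f;  // single adapter alias (alpha 1.0 assumed)
    }
    // lora_weights only overrides adapters that are already active.
    for (const auto& [alias, alpha] : requestOverrides) {
        auto it = result.find(alias);
        if (it != result.end() && it->second != 0.0f) {
            it->second = alpha;
        }
    }
    return result;
}
```

Returning a full map, including zeroed entries, reflects the comment's point that the adapter config must be set explicitly on every request.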

Comment thread src/test/pull_hf_model_test.cpp
- Add validateLoraAdapterConfig() for alpha consistency between
  individual and composite levels (error if both non-default)
- NPU validation: composites required for multi-LoRA, all adapters
  must be referenced, consistent alpha across composites
- Fix Windows drive letter colon in CLI parser (lastColon > 1)
- Document LoRA adapter modes (DYNAMIC/STATIC/FUSE) in reference.md
- Document --source_loras format, alpha, source type detection
- Add tests: alpha at individual/composite/both levels, explicit 1.0,
  Windows absolute path with alpha
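The alpha consistency check described in this commit (error if alpha is non-default at both the individual and the composite level) might look roughly like the sketch below. It is not the PR's `validateLoraAdapterConfig()`: the names are hypothetical, 1.0 is assumed to be the default alpha, and the unknown-adapter check is an added assumption.

```cpp
#include <map>
#include <string>
#include <vector>

constexpr float kDefaultAlpha = 1.0f;  // assumed default

// Hypothetical composite component for illustration.
struct Component {
    std::string adapterAlias;
    float alpha = kDefaultAlpha;
};

// Returns an empty string when the configuration is consistent,
// otherwise a description of the first problem found.
std::string checkAlphaConsistency(
    const std::map<std::string, float>& adapterAlphas,
    const std::map<std::string, std::vector<Component>>& composites) {
    for (const auto& [name, components] : composites) {
        for (const auto& component : components) {
            const auto it = adapterAlphas.find(component.adapterAlias);
            if (it == adapterAlphas.end()) {
                return "composite '" + name + "' references unknown adapter '" +
                       component.adapterAlias + "'";
            }
            // Alpha may be set on the adapter or on the composite, not both.
            if (it->second != kDefaultAlpha && component.alpha != kDefaultAlpha) {
                return "adapter '" + component.adapterAlias +
                       "' has non-default alpha at both levels in composite '" + name + "'";
            }
        }
    }
    return {};
}
```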
@atobiszei atobiszei changed the title from "WIP LORA image generation" to "Add LoRA handling for image generation" May 18, 2026
// See the License for the specific language governing permissions and
// limitations under the License.
//*****************************************************************************
#include "curl_downloader.hpp"
Collaborator Author

Extracted from GGUF_downloader

Collaborator

Those are almost the same. Can we make a base curlDownloader class?

Collaborator Author

The common functionality is already extracted into the downloadFileWithCurl() free function in curl_downloader.cpp, which both GGUF and LoRA paths call. The orchestration around it differs: GGUF handles multi-part file resolution and overwrite-remove logic; LoRA handles source-type dispatch (HF repo / URL / local),

Comment thread demos/common/export_models/export_model.py Outdated
**FUSE mode:**
- The adapter is merged into base weights during model compilation using `MODE_FUSE`.
- It is always active — the base model without the adapter is **not accessible**.
- Does not appear in the list of routable adapters and cannot be selected or deselected via the `model` field.
Collaborator Author

Only the adapter is available.

SttServableMap sttServableMap;
TtsServableMap ttsServableMap;
std::vector<std::string> loraAliases;
bool hideBaseModel = false;
Collaborator

Please describe what it is used for.

Comment thread src/test/graph_export_test.cpp Outdated
ASSERT_EQ(std::get<Status>(res), ovms::StatusCode::PLUGIN_CONFIG_CONFLICTING_PARAMETERS);
}

// ===================== LoRA Graph Export Tests =====================
Collaborator

@rasapala rasapala May 18, 2026

I suggest adding a new file for these LoRA-specific tests. I don't see that we reuse much from this file.

Collaborator Author

removeVersionString().

I see it is already defined in two places (both the graph_export and hf_pull tests), so I will extract it and share it across all three files.

uint16_t n = 3;
testResponseFromOvTensor(n);
}
// ===================== LoRA Proto Parsing Tests =====================
Collaborator

I suggest adding a new file with these tests.

Collaborator Author

In this case I am not convinced: LoRAs are mainly for image generation, and here we test basically the same thing (proto parsing).

Comment on lines +138 to +139
std::vector<std::string> loraAliases_;
bool hideBaseModel_ = false;
Collaborator Author

Drop the trailing "_".

4 participants